*Target (protein/gene name): Serine/threonine-specific protein phosphatase

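*NCBI Gene # or RefSeq#: 5811 (this is the NCBI Taxonomy ID for Toxoplasma gondii, as used in the Taxonomy Browser link below; no gene-specific ID is recorded)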

*Protein ID (NP or XP #) or Wolbachia#: None on PDB Page

*Organism (including strain): Toxoplasma gondii (strain ME49)

Etiologic Risk Group (see link below): Group A

*Background/Disease Information (sort of like the Intro to your Mini Research Write up):
Toxoplasma gondii is a single-celled parasite that causes the disease toxoplasmosis. The parasite is found throughout the world, and more than 60 million people in the United States may be infected. Very few infected people have symptoms, because a healthy person's immune system usually keeps the parasite from causing illness. However, pregnant women and individuals with compromised immune systems should be cautious; for them, a Toxoplasma infection could cause serious health problems. [1]

Link to TDR Targets page:
http://www.tdrtargets.org/search?form=simple&form_submit=yes&query=Toxoplasma+gondii

Link to Gene Database page (NCBI, EuPath databases, e.g. TriTrypDB, PlasmoDB, etc., or PATRIC, etc.):
http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?lvl=0&id=5811

Essentiality of this protein:
Gene Ontology
GO:0000159 protein phosphatase type 2A complex

Function:
GO:0008601 protein phosphatase type 2A regulator activity
GO:0005488 binding

Complex of proteins?: No

Druggable Target (list number or cite evidence from a paper/database showing druggable in another organism):
No chemical compounds

*EC#: 3.1.3.16
Screen Shot 2014-05-03 at 2.06.42 PM.png

Enzyme Assay information (spectrophotometric, coupled assay ?, reagents):
-- link to Sigma (or other company) page for assay (see Sigma links below)
-- -or link (or citation) to paper that contains assay information
-- links to assay reagents (substrates) pages.
--- List cost and quantity of substrate reagents, supplier, and catalog #

Structure Available (PDB or Homology model)
-- PDB # or closest PDB entry if using homology model:
-- For Homology Model option:
---- Show pairwise alignment of your BLASTP search in NCBI against the PDB
---- Query Coverage:
---- Max % Identities:
---- % Positives
---- Chain used for homology:

Current Inhibitors: No chemical compounds associated with this protein

Expression Information (has it been expressed in bacterial cells): No

Purification Method: Affinity chromatography with a 6xHis tag

Image of protein (PyMol with features delineated and shown separately):
No PyMol image of protein

*Amino Acid Sequence:

MAGTSIFQGSSWARILRDAVQIIPGRFFFLSLTGTPWDTQSIHFFCTDQTFQYEPFFADF
GPLSLGCIYKYAKLLESKLKEAEERQHILIHYSAIHPEKRANAALLIGAAQMLLFGMSAQ
EAYRPFLNISPRFVPFRDATCGPCNFKLTILDCLKGLEFAMKLGWFDYKTFNVDEYDYYE
KLKNGDMNWIIPSRILAFSCPSSATGNHDGYSTCTPEDYANIFNSLGIKTVVRLNKKQYD
ARKFTDRNIEHVDLFFVDGTCPSREIIQAFLQVVENRDHPIAVHCKAGLGRTGTLIGCYA
IKNFKFPAVEWIGWNRLCRPGSILGPQQQFLTEIQHELLQMNRENSIHTRRLHASQPSKS
LPSSSPSQKDDLADCLAKLSLDSRRVAEHGDAGQGERLLVAKRQQQAGLSTAASSTSSSV
NGIDSGSQRKILPLPLLLSRPLSSNTAPFSEVTVLFNVHLLKLHTIRSLTFCRGIVSRC

*length of protein in Amino Acids: 479

Molecular Weight of protein in kiloDaltons: 53.79 kDa (53793.6 Da)

Molar Extinction coefficient of your protein at 280 nm wavelength: 53245 M^-1 cm^-1
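
The length, molecular weight, and extinction coefficient above can be re-derived from the amino acid sequence. The following is a minimal sketch, not necessarily the tool used to fill in this page. It assumes standard average residue masses and the commonly used 280 nm coefficients (Trp 5500, Tyr 1490, cystine 125 M^-1 cm^-1), and it assumes all cysteines are paired as cystines. The names PROTEIN, RESIDUE_MASS, molecular_weight, and extinction_280 are illustrative only.

```python
# Protein sequence copied from the "Amino Acid Sequence" entry above.
PROTEIN = (
    "MAGTSIFQGSSWARILRDAVQIIPGRFFFLSLTGTPWDTQSIHFFCTDQTFQYEPFFADF"
    "GPLSLGCIYKYAKLLESKLKEAEERQHILIHYSAIHPEKRANAALLIGAAQMLLFGMSAQ"
    "EAYRPFLNISPRFVPFRDATCGPCNFKLTILDCLKGLEFAMKLGWFDYKTFNVDEYDYYE"
    "KLKNGDMNWIIPSRILAFSCPSSATGNHDGYSTCTPEDYANIFNSLGIKTVVRLNKKQYD"
    "ARKFTDRNIEHVDLFFVDGTCPSREIIQAFLQVVENRDHPIAVHCKAGLGRTGTLIGCYA"
    "IKNFKFPAVEWIGWNRLCRPGSILGPQQQFLTEIQHELLQMNRENSIHTRRLHASQPSKS"
    "LPSSSPSQKDDLADCLAKLSLDSRRVAEHGDAGQGERLLVAKRQQQAGLSTAASSTSSSV"
    "NGIDSGSQRKILPLPLLLSRPLSSNTAPFSEVTVLFNVHLLKLHTIRSLTFCRGIVSRC"
)

# Average residue masses in Da (amino acid minus one water). Assumed standard values.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added back for the free N- and C-termini


def molecular_weight(seq):
    """Average molecular weight in Da of an unmodified polypeptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def extinction_280(seq, cystines=True):
    """Estimated molar extinction coefficient at 280 nm (M^-1 cm^-1)."""
    value = seq.count("W") * 5500 + seq.count("Y") * 1490
    if cystines:
        # Assumption: every pair of Cys residues forms one disulfide (cystine).
        value += (seq.count("C") // 2) * 125
    return value


length = len(PROTEIN)
mw_da = molecular_weight(PROTEIN)
eps = extinction_280(PROTEIN)
print(length, round(mw_da / 1000, 2), eps)
```

With 6 Trp, 13 Tyr, and 14 Cys in the sequence, the all-cystine assumption reproduces the 53245 value listed above. If the cysteines are taken to be reduced, the estimate drops to 52370, so it is worth noting which assumption applies to the purified protein.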

TMpred graph Image
Screen Shot 2014-05-03 at 2.39.31 PM.png

*CDS Gene Sequence :

ATGGCAGGGACAAGCATTTTTCAAGGCTCTTCATGGGCTCGCATTCTTCGCGATGCGGTC CAGATCATTCCAGGTCGTTTCTTTTTCCTCTCACTGACTGGCACTCCTTGGGATACCCAG TCCATACACTTCTTCTGCACAGATCAGACATTCCAGTATGAGCCATTCTTCGCAGACTTC GGACCTCTTAGTCTTGGTTGCATTTACAAATATGCAAAACTTCTTGAATCCAAGCTCAAA GAAGCAGAGGAACGGCAACACATTTTGATACACTATTCGGCCATTCATCCTGAAAAGCGA GCAAATGCTGCGCTCTTAATCGGGGCAGCTCAGATGCTTCTCTTCGGAATGTCCGCTCAA GAGGCATACAGGCCGTTTCTTAATATTTCACCTCGATTCGTTCCTTTCCGGGATGCAACA TGCGGACCGTGTAATTTCAAGCTGACAATTTTGGACTGCCTTAAGGGTCTGGAATTCGCG ATGAAACTTGGATGGTTTGACTACAAGACCTTCAATGTTGACGAGTACGACTACTATGAA AAGTTGAAAAACGGAGATATGAATTGGATTATCCCAAGCCGTATCCTGGCATTCTCTTGC CCTTCCAGCGCCACAGGCAACCACGACGGATACAGCACGTGCACCCCAGAAGACTACGCG AATATTTTCAATAGCCTAGGGATTAAAACAGTCGTACGGTTGAACAAGAAACAATATGAT GCCAGGAAGTTCACCGACAGGAACATCGAGCACGTGGATCTATTTTTCGTAGATGGTACC TGCCCCTCCAGAGAAATAATACAAGCATTTCTGCAGGTGGTCGAAAACCGAGATCATCCA ATTGCGGTGCACTGTAAAGCTGGCCTTGGCCGAACAGGGACCCTCATCGGATGCTATGCC ATCAAAAACTTCAAGTTCCCCGCAGTTGAGTGGATAGGCTGGAACCGATTGTGCAGGCCG GGTAGCATTTTGGGGCCACAACAGCAATTCCTTACTGAAATACAGCACGAGTTGCTTCAA ATGAATCGGGAGAATTCTATACACACACGCCGATTACATGCTTCACAGCCATCGAAATCG CTGCCAAGCTCTTCTCCGTCTCAAAAGGACGATCTTGCCGATTGCTTAGCGAAACTTTCC CTGGACAGTCGGCGAGTGGCGGAACATGGAGACGCTGGACAAGGCGAGCGACTGCTCGTT GCCAAGCGGCAACAGCAGGCAGGGCTGTCTACTGCTGCATCGTCTACGTCGTCAAGTGTA AATGGCATTGACTCGGGGTCCCAAAGAAAGATCCTTCCCCTTCCTCTTCTTCTCTCCCGT CCTCTATCTAGCAATACTGCTCCGTTCTCAGAAGTGACGGTGCTATTCAATGTGCATCTT TTGAAGCTTCACACCATCAGAAGCCTCACATTCTGTAGGGGAATTGTCTCGCGCTGTTGA

*GC% Content for gene: 48.6%
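
The GC% figure can be checked directly from the CDS. This is a small sketch under the assumption that GC content means the fraction of G and C bases over all bases, with the spaces in the pasted sequence ignored. The function name gc_percent is illustrative, and the fragment below is only the first 60 bases of the native CDS; paste the full sequence to check the whole gene.

```python
def gc_percent(seq):
    """Percent of G and C bases, ignoring whitespace and letter case."""
    bases = "".join(seq.split()).upper()
    if not bases:
        raise ValueError("empty sequence")
    gc = bases.count("G") + bases.count("C")
    return 100.0 * gc / len(bases)


# First 60 bases of the native CDS above, used only as a short demonstration.
fragment = "ATGGCAGGGACAAGCATTTTTCAAGGCTCTTCATGGGCTCGCATTCTTCGCGATGCGGTC"
print(f"{gc_percent(fragment):.1f}%")
```

The same function works unchanged on the codon-optimized sequence, since the blocks of ten bases separated by spaces are joined before counting.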

*CDS Gene Sequence (codon optimized):
ATGGCCGGAA CCTCTATTTT TCAGGGTTCT TCTTGGGCCC GGATTCTTCG CGATGCGGTC CAGATTATTC CAGGCCGTTT CTTTTTCCTC TCCCTGACTG GTACCCCCTG GGATACGCAG TCCATTCACT TCTTTTGCAC AGATCAAACG TTCCAGTACG AGCCTTTTTT CGCGGATTTT GGTCCATTGA GCTTAGGCTG TATCTACAAG TATGCCAAAT TATTAGAGTC AAAATTAAAA GAAGCAGAAG AGCGTCAGCA CATTCTCATC CATTACAGCG CAATTCACCC GGAAAAACGC GCCAATGCTG CATTGCTGAT TGGTGCGGCC CAGATGCTGC TTTTTGGAAT GAGCGCGCAG GAGGCATACC GCCCGTTTCT GAACATCAGC CCTCGGTTTG TGCCATTTCG TGATGCGACC TGCGGCCCTT GTAATTTTAA ATTGACGATC CTTGATTGCC TGAAAGGCCT GGAATTTGCC ATGAAACTGG GTTGGTTTGA CTATAAGACG TTCAATGTGG ACGAATACGA CTACTATGAA AAGTTAAAAA ACGGCGATAT GAATTGGATT ATTCCTAGTC GCATTCTGGC GTTCAGTTGT CCGTCCAGCG CGACGGGCAA CCATGATGGC TACTCCACGT GTACCCCGGA AGATTACGCA AATATTTTCA ACTCCCTGGG CATTAAAACC GTTGTTCGTT TAAATAAAAA ACAGTATGAC GCGCGCAAGT TCACTGATCG TAACATTGAG CATGTTGATC TGTTCTTTGT CGACGGGACA TGCCCGAGTC GGGAAATCAT CCAAGCGTTC CTGCAGGTTG TCGAAAACCG CGATCATCCG ATCGCGGTTC ATTGCAAAGC CGGCCTTGGC CGTACCGGCA CACTGATTGG TTGCTATGCG ATTAAGAATT TCAAATTCCC GGCCGTAGAA TGGATCGGAT GGAACCGGCT TTGCCGTCCG GGTAGCATTT TGGGGCCACA GCAGCAGTTC CTTACGGAGA TTCAGCACGA GCTCCTGCAG ATGAATCGGG AGAACTCGAT TCATACTCGT CGTCTGCATG CAAGCCAGCC GTCTAAAAGT CTGCCGTCAT CTAGCCCATC CCAGAAAGAT GACCTGGCTG ACTGTCTTGC GAAACTGTCA CTCGATAGCC GCCGTGTAGC AGAGCATGGA GACGCCGGGC AAGGTGAACG TCTGCTGGTT GCTAAACGCC AGCAACAGGC TGGTCTGTCA ACCGCAGCTT CTTCGACCTC CTCGAGTGTT AATGGCATCG ATTCAGGATC GCAACGTAAA ATCCTGCCAT TGCCGCTGCT CCTGTCCCGT CCTCTGTCAA GCAACACCGC GCCTTTCTCG GAGGTCACCG TTTTATTTAA CGTGCACCTG CTGAAATTAC ACACAATTCG CTCCCTCACC TTCTGTCGTG GCATTGTTTC CCGTTGTTAA

*GC% Content for gene (codon optimized): 50.1%
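
A codon-optimized CDS should encode exactly the same protein as the native CDS. The sketch below shows one way to check that, assuming the standard genetic code applies to both sequences. The names CODON_TABLE and translate are illustrative, and only the first five codons of each sequence are used here; paste the full sequences to compare the whole protein.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate a CDS to protein, stopping at the first stop codon."""
    seq = "".join(cds.split()).upper()  # drop spaces from the pasted sequence
    if len(seq) % 3:
        raise ValueError("CDS length is not a multiple of 3")
    protein = []
    for pos in range(0, len(seq), 3):
        aa = CODON_TABLE[seq[pos:pos + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First five codons of the native and codon-optimized sequences above.
native_start = "ATGGCAGGGACAAGC"
optimized_start = "ATGGCCGGAA CCTCT"
print(translate(native_start), translate(optimized_start))
```

Both fragments translate to MAGTS, matching the start of the amino acid sequence listed on this page. Running the full native and optimized sequences through translate and comparing the results to the listed protein is a quick sanity check before ordering primers.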




Resources:


See ProtocolTargetDiscoveryVDS.docx for more

Etiologic Risk Group Categories (for pathogens): http://www.utexas.edu/research/rsc/ibc/agent_class.html#_Toc7238334



SIGMA-ALDRICH RESOURCES

Enzyme Explorer

http://www.sigmaaldrich.com/life-science/metabolomics/enzyme-explorer.html



Enzyme Classification Index (EC number)

http://www.sigmaaldrich.com/life-science/biochemicals/biochemical-products.html?TablePage=14573088





WolframAlpha http://www.wolframalpha.com/

DrugBank http://www.drugbank.ca/





Databases of genes/organisms:

http://www.niaid.nih.gov/Pages/default.aspx

http://eupathdb.org/eupathdb/

https://patricbrc.vbi.vt.edu/portal/portal/patric/Home

http://www.nmpdr.org/FIG/wiki/view.cgi/Main/EssentialGenes

http://tubic.tju.edu.cn/deg/

http://csgid.org/csgid/cake/pages/community_request_gateway

http://tdrtargets.org/

http://gsc.jcvi.org/status.shtml





Scientific Nomenclature page from the Centers for Disease Control and Prevention (gene, protein names and abbreviations)

http://wwwnc.cdc.gov/eid/pages/scientific-nomenclature.htm





Gene Information:

NCBI GENE Page: http://www.ncbi.nlm.nih.gov/gene

BLAST Page: http://blast.ncbi.nlm.nih.gov/



Protein Information:

NCBI Protein Page: http://www.ncbi.nlm.nih.gov/protein

Protein Expression Website

Protein Expression Paper: SGC_ProteinProductionPurificationNatMethods2008.pdf



Primer Overlap PCR Articles

HooverLubkowski_PCRoverlapcloninggnf042.pdf

StemmerPCRoverlapGene1995.pdf



Is my target good for Virtual Screening programs?

Reynolds_THermodynamicsLigandBinding_MedChemLett2011.pdf